7 research outputs found

eXTRA: A Culturally Enriched Malay Text to Speech System

Author: Ainon Raja Noor
Lutfi Syaheerah L.
Mohd Don Zuraidah
Montero Martínez Juan Manuel
Publication venue: E.T.S.I. Telecomunicación (UPM)
Publication date: 01/01/2008

This paper concerns the incorporation of naturalness into Malay Text-to-Speech (TTS) systems through the addition of a culturally localized affective component. Previous studies on emotion theories were examined to draw up assumptions about emotions. These studies also include findings from observations by anthropologists and researchers on culture-specific emotions, particularly those of the Malay culture. These findings were used to elicit the requirements for modeling affect in a TTS system that conforms to the people of the Malay culture in Malaysia. The goal is to introduce a novel method for generating Malay expressive speech by embedding a localized ‘emotion layer’ called eXpressive Text Reader Automation Layer, abbreviated as eXTRA. In a pilot project, the prototype is used with Fasih, the first Malay Text-to-Speech system developed by MIMOS Berhad, which can read unrestricted Malay text in four emotions: anger, sadness, happiness and fear. This paper, however, concentrates on the first two emotions. eXTRA is evaluated through open perception tests by both native and non-native listeners. The results show a recognition rate of more than sixty percent, which confirms the satisfactory performance of the approaches.

CiteSeerX

Archivo Digital UPM

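Generación de una voz sintética en Castellano basada en HSMM para la Evaluación Albayzín 2008: conversión texto a voz [Generation of an HSMM-based synthetic voice in Spanish for the Albayzín 2008 Evaluation: text-to-speech conversion]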

Author: Barra Chicote Roberto
King Simon
Lutfi Syaheerah L.
Macías Guarasa Javier
Montero Martínez Juan Manuel
Yamagishi J.
Publication venue: E.T.S.I. Telecomunicación (UPM)
Publication date: 01/01/2008

This paper describes the process of building a Spanish voice using the UPC ESMA corpus from UPC, provided by the Albayzín 2008 Evaluation: Text-to-Speech Conversion. One voice was implemented using unit selection with Festival's Multisyn package, and another using Hidden Semi-Markov Models (HSMM) with HTS. After a brief evaluation of the quality of both voices, the paper details the main characteristics of the HSMM-based voice, the final system submitted to the evaluation.

Archivo Digital UPM

Spanish Expressive Voices: corpus for emotion research in Spanish

Author: Barra Chicote Roberto
Córdoba Herralde Ricardo de
D'haro Enríquez Luis Fernando
Fernández Martínez Fernando
Ferreiros López Javier
Lucas Cuesta Juan Manuel
Lutfi Syaheerah L.
Macías Guarasa Javier
Montero Martínez Juan Manuel
Pardo Muñoz José Manuel
San Segundo Hernández Rubén
Publication venue: E.T.S.I. Telecomunicación (UPM)
Publication date: 01/05/2008

A new emotional multimedia database has been recorded and aligned. The database comprises speech and video recordings of one actor and one actress simulating a neutral state and the Big Six emotions: happiness, sadness, anger, surprise, fear and disgust. Owing to its careful design and its size (more than 100 minutes per emotion), the recorded database allows comprehensive studies on emotional speech synthesis, prosodic modelling, speech conversion, far-field speech recognition, and speech- and video-based emotion identification. The database has been automatically labelled for prosodic purposes (5% was manually revised). The whole database has been validated through objective and perceptual tests, achieving a validation score as high as 89%.

Archivo Digital UPM